Non-linear predictive vector quantization of feature vectors for distributed speech recognition
Authors
Abstract
In this paper, we present a non-linear prediction scheme based on a Multi-Layer Perceptron for Predictive Vector Quantization (PVQ-MLP) of MFCCs, aimed at very low bit-rate coding of acoustic features in distributed speech recognition (DSR). Applications such as voice-enabled web browsing or speech-controlled processes in large industrial plants, where hundreds of users simultaneously access the same ASR server, can benefit from this substantial bit-rate reduction. Experimental results on a large-vocabulary task show that PVQ-MLP outperforms a linear prediction scheme in terms of prediction gain and WER, especially at low bit-rates. Using PVQ-MLP, the bit-rate can be reduced to as low as 1.8 kbps, a reduction of 66% with respect to the ETSI standards (4.4 kbps), with a WER degradation lower than 5% compared to a system without quantization.
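The general shape of a predictive vector quantizer with an MLP predictor can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the network topology, the predictor order, or the codebook design, so the sketch assumes a single hidden layer with untrained random weights, a prediction from one previous reconstructed frame, and a full-search codebook. All names (`mlp_predict`, `pvq_encode`, `pvq_decode`, and the toy dimensions) are hypothetical.

```python
import math
import random


def mlp_predict(prev, w1, b1, w2, b2):
    """One-hidden-layer MLP (tanh hidden units, linear output).

    Assumed topology for illustration; the paper's network is not specified here.
    """
    hidden = [math.tanh(sum(w * x for w, x in zip(row, prev)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]


def nearest_codeword(codebook, vector):
    """Full search: index of the codeword with the smallest squared distance."""
    def sq_dist(codeword):
        return sum((c - v) ** 2 for c, v in zip(codeword, vector))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def pvq_encode(frames, codebook, predictor):
    """Quantize the prediction error of each frame; return codebook indices."""
    prev_recon = [0.0] * len(frames[0])
    indices = []
    for frame in frames:
        prediction = predictor(prev_recon)
        error = [f - p for f, p in zip(frame, prediction)]
        idx = nearest_codeword(codebook, error)
        indices.append(idx)
        # Predict from the reconstruction, not the original frame,
        # so the encoder and decoder stay in sync.
        prev_recon = [p + c for p, c in zip(prediction, codebook[idx])]
    return indices


def pvq_decode(indices, codebook, predictor, dim):
    """Rebuild frames from indices using the same predictor and codebook."""
    prev_recon = [0.0] * dim
    frames = []
    for idx in indices:
        prediction = predictor(prev_recon)
        prev_recon = [p + c for p, c in zip(prediction, codebook[idx])]
        frames.append(prev_recon)
    return frames


# Toy setup with random (untrained) weights and a random codebook.
rng = random.Random(0)
dim, hidden_units, codebook_size = 3, 4, 8
w1 = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(hidden_units)]
b1 = [0.0] * hidden_units
w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_units)] for _ in range(dim)]
b2 = [0.0] * dim
codebook = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(codebook_size)]
frames = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(5)]


def predictor(prev):
    return mlp_predict(prev, w1, b1, w2, b2)


indices = pvq_encode(frames, codebook, predictor)
decoded = pvq_decode(indices, codebook, predictor, dim)
```

Only the indices are transmitted, so in this sketch each frame costs log2(codebook size) bits. A better predictor leaves a smaller prediction error, which is what allows a smaller codebook at a comparable distortion.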
Similar resources
Non-linear Compression of Fea Transform Coding and Non-unif
This paper uses transform coding for compressing feature vectors in distributed speech recognition applications. Feature vectors are first grouped together into non-overlapping blocks and a transformation applied. A non-uniform allocation of bits to the elements of the resultant matrix is based on their relative information content. Analysis of the amplitude distribution of these elements indic...
Predictive vector quantization using the M-algorithm for distributed speech recognition
In this paper we present a predictive vector quantizer for distributed speech recognition that makes use of a delayed decision coding scheme, performing the optimal codeword searching by means of the M-algorithm. In single-path predictive vector quantization coders, each frame is coded with the closest codeword to the prediction error. However, prediction errors and quantization errors of futur...
Fuzzy Clustering Approach Using Data Fusion Theory and its Application To Automatic Isolated Word Recognition
In this paper, utilization of clustering algorithms for data fusion in decision level is proposed. The results of automatic isolated word recognition, which are derived from speech spectrograph and Linear Predictive Coding (LPC) analysis, are combined with each other by using fuzzy clustering algorithms, especially fuzzy k-means and fuzzy vector quantization. Experimental results show that the...
Low bit-rate feature vector compression using transform coding and non-uniform bit allocation
This paper presents a novel method for the low bit-rate compression of a feature vector stream with particular application to distributed speech recognition. The scheme operates by grouping feature vectors into non-overlapping blocks and applying a transformation to give a more compact matrix representation. Both Karhunen-Loeve and discrete cosine transforms are considered. Following transforma...
Differential vector quantization of feature vectors for distributed speech recognition
Distributed speech recognition arises for solving computational limitations of mobile devices like PDAs or mobile phones. Due to bandwidth restrictions, it is necessary to develop efficient transmission techniques of acoustic features in Automatic Speech Recognition applications. This paper presents a technique for compressing acoustic feature vectors based on Differential Vector Quantization. ...
Publication date: 2010